AITopics | test video

test video

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ad02c6f3824f871395112ae71a28eff7-Paper-Conference.pdf

Neural Information Processing Systems | Feb-11-2026, 07:53:23 GMT

There is no simple heuristic to reliably connect an object and the visual effects it generates.

artificial intelligence, machine learning, video, (17 more...)

Country: North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

96b250a90d3cf0868c83f8c965142d2a-Supplemental.pdf

Neural Information Processing Systems | Feb-10-2026, 02:38:29 GMT

dcvc, memc, test video, (13 more...)

Country: Asia (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.70)
Information Technology > Sensing and Signal Processing > Image Processing (0.51)

Supplementary Material to Deep Contextual Video Compression

Neural Information Processing Systems | Feb-12-2025, 00:50:00 GMT

This document provides the supplementary material to our proposed deep contextual video compression (DCVC), including detailed network structures, training strategies, and additional experimental results that demonstrate the effectiveness of DCVC. In the network structure diagram, the upper part is the encoder and the lower part is the decoder. For simplicity, the entropy model is omitted. ResBlock denotes a plain residual block, and the numbers denote channel dimensions.

artificial intelligence, dcvc, machine learning, (14 more...)

Country: Asia (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.69)
Information Technology > Sensing and Signal Processing > Image Processing (0.51)

Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion

Agarwal, Shruti, Hu, Liwen, Ng, Evonne, Darrell, Trevor, Li, Hao, Rohrbach, Anna

arXiv.org Artificial Intelligence | Dec-20-2021

In today's era of digital misinformation, we are increasingly faced with new threats posed by video falsification techniques. Such falsifications range from cheapfakes (e.g., lookalikes or audio dubbing) to deepfakes (e.g., sophisticated AI media synthesis methods), which are becoming perceptually indistinguishable from real videos. To tackle this challenge, we propose a multi-modal semantic forensic approach to discover clues that go beyond detecting discrepancies in visual quality, thereby handling both simpler cheapfakes and visually persuasive deepfakes. In this work, our goal is to verify that the purported person seen in the video is indeed themselves by detecting anomalous correspondences between their facial movements and the words they are saying. We leverage the idea of attribution to learn person-specific biometric patterns that distinguish a given speaker from others. We use interpretable Action Units (AUs) to capture a person's face and head movement as opposed to deep CNN visual features, and we are the first to use word-conditioned facial motion analysis. Unlike existing person-specific approaches, our method is also effective against attacks that focus on lip manipulation. We further demonstrate our method's effectiveness on a range of fakes not seen in training, including those without video manipulation, that were not addressed in prior work.

computer vision, fake video, video, (15 more...)

2112.10936

Country:

Europe > Belgium (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

End-to-End Deep Learning for Self-Driving Cars

#artificialintelligence | Dec-4-2021, 16:01:05 GMT

In a new automotive application, we have used convolutional neural networks (CNNs) to map the raw pixels from a front-facing camera to the steering commands for a self-driving car. This powerful end-to-end approach means that with minimum training data from humans, the system learns to steer, with or without lane markings, on both local roads and highways. The system can also operate in areas with unclear visual guidance such as parking lots or unpaved roads. We designed the end-to-end learning system using an NVIDIA DevBox running Torch 7 for training. An NVIDIA DRIVE™ PX self-driving car computer, also with Torch 7, was used to determine where to drive while operating at 30 frames per second (FPS).

cnn, vehicle, video, (16 more...)

Country:

North America > United States > New Jersey > Monmouth County (0.05)
North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
(2 more...)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Information Technology (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

AGENT: A Benchmark for Core Psychological Reasoning

Shu, Tianmin, Bhandwaldar, Abhishek, Gan, Chuang, Smith, Kevin A., Liu, Shari, Gutfreund, Dan, Spelke, Elizabeth, Tenenbaum, Joshua B., Ullman, Tomer D.

arXiv.org Artificial Intelligence | Feb-25-2021

For machine agents to successfully interact with humans in real-world settings, they will need to develop an understanding of human mental life. Intuitive psychology, the ability to reason about hidden mental variables that drive observable actions, comes naturally to people: even pre-verbal infants can tell agents from objects, expecting agents to act efficiently to achieve goals given constraints. Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning. Inspired by cognitive development studies on intuitive psychology, we present a benchmark consisting of a large dataset of procedurally generated 3D animations, AGENT (Action, Goal, Efficiency, coNstraint, uTility), structured around four scenarios (goal preferences, action efficiency, unobserved constraints, and cost-reward trade-offs) that probe key concepts of core intuitive psychology. We validate AGENT with human-ratings, propose an evaluation protocol emphasizing generalization, and compare two strong baselines built on Bayesian inverse planning and a Theory of Mind neural network. Our results suggest that to pass the designed tests of core intuitive psychology at human levels, a model must acquire or have built-in representations of how agents plan, combining utility computations and core knowledge of objects and physics.

agent, scenario, video, (15 more...)

2102.12321

Country:

Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(2 more...)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Learning Grammar of Complex Activities via Deep Neural Networks

Mashaido, Becky

arXiv.org Artificial Intelligence | Jan-7-2021

Motivated by the growing amount of publicly available video data on online streaming services and an increased interest in applications that analyze continuous video streams such as autonomous driving, this technical report provides a theoretical insight into deep neural networks for video learning, under label constraints. I build upon previous work in video learning for computer vision, make observations on model performance and propose further mechanisms to help improve our observations.

attention mechanism, february 17, regularized model, (11 more...)

2101.02774

Country: North America > United States > Massachusetts > Suffolk County > Boston (0.05)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

End-to-End Deep Learning for Self-Driving Cars

#artificialintelligence | Aug-21-2016, 15:08:02 GMT

artificial intelligence, machine learning, video, (19 more...)

Country:

North America > United States > New Jersey > Monmouth County (0.05)
North America > United States > Pennsylvania (0.04)
North America > United States > New York (0.04)
(2 more...)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Semantic Concept Discovery for Large-Scale Zero-Shot Event Detection

Chang, Xiaojun (University of Technology Sydney) | Yang, Yi (University of Technology Sydney) | Hauptmann, Alexander (Carnegie Mellon University) | Xing, Eric P (Carnegie Mellon University) | Yu, Yao-Liang (Carnegie Mellon University)

AAAI Conferences | Jul-15-2015

We focus on detecting complex events in unconstrained Internet videos. While most existing works rely on the abundance of labeled training data, we consider a more difficult zero-shot setting where no training data is supplied. We first pre-train a number of concept classifiers using data from other sources. Then we evaluate the semantic correlation of each concept w.r.t. the event of interest. After further refinement to take prediction inaccuracy and discriminative power into account, we apply the discovered concept classifiers on all test videos and obtain multiple score vectors. These distinct score vectors are converted into pairwise comparison matrices and the nuclear norm rank aggregation framework is adopted to seek consensus. To address the challenging optimization formulation, we propose an efficient, highly scalable algorithm that is an order of magnitude faster than existing alternatives. Experiments on recent TRECVID datasets verify the superiority of the proposed approach.
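
To make the score-vector step in the abstract above concrete, here is a minimal Python sketch. It is not the authors' method: the pairwise matrix definition (entry [i][j] is the score difference between video i and video j) is an assumption, and a plain average of the matrices stands in for the nuclear norm rank aggregation the abstract names. The function names and the toy scores are hypothetical.

```python
from typing import List, Tuple


def pairwise_matrix(scores: List[float]) -> List[List[float]]:
    """Build a skew-symmetric comparison matrix from one score vector.

    Assumption: entry [i][j] is scores[i] - scores[j], so a positive value
    means video i is ranked above video j by this concept classifier.
    """
    return [[si - sj for sj in scores] for si in scores]


def aggregate(score_vectors: List[List[float]]) -> Tuple[List[float], List[int]]:
    """Combine several score vectors into one consensus ranking.

    Simplification: the matrices are averaged element-wise. This replaces
    the nuclear norm optimization described in the abstract and is only
    meant to show the data flow.
    """
    n = len(score_vectors[0])
    mats = [pairwise_matrix(v) for v in score_vectors]
    mean = [
        [sum(m[i][j] for m in mats) / len(mats) for j in range(n)]
        for i in range(n)
    ]
    # The row mean of the averaged matrix is a consensus score per video.
    consensus = [sum(row) / n for row in mean]
    ranking = sorted(range(n), key=lambda i: consensus[i], reverse=True)
    return consensus, ranking


# Three hypothetical concept classifiers scoring three test videos.
vectors = [[0.9, 0.2, 0.5], [0.8, 0.3, 0.4], [0.7, 0.1, 0.6]]
consensus, ranking = aggregate(vectors)
print(ranking)
```

With plain averaging the consensus reduces to each video's mean score minus the overall mean, so this sketch gains nothing over averaging the raw scores; the matrix form matters only once a low-rank consensus is sought across noisy classifiers.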

detection, event detection, video, (16 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.46)

Industry: Government > Regional Government > North America Government > United States Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.83)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.65)
